Overview

Dataset statistics

Number of variables: 31
Number of observations: 79330
Missing cells: 28
Missing cells (%): < 0.1%
Duplicate rows: 25902
Duplicate rows (%): 32.7%
Total size in memory: 18.8 MiB
Average record size in memory: 248.0 B
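
The derived figures in this table can be cross-checked from the raw counts. The sketch below uses plain Python with illustrative variable names. It assumes that the missing-cell percentage is taken over all rows × variables cells and that 1 MiB is 1024² bytes; neither assumption is stated in the report itself.

```python
# Raw figures copied from the dataset statistics above.
n_variables = 31
n_observations = 79330
missing_cells = 28
duplicate_rows = 25902
avg_record_bytes = 248.0

# Share of duplicate rows among all observations.
duplicate_pct = duplicate_rows / n_observations * 100

# Share of missing cells among all cells (assumed: rows x variables).
missing_pct = missing_cells / (n_observations * n_variables) * 100

# Total size implied by the average record size (assumed: 1 MiB = 1024**2 bytes).
total_mib = avg_record_bytes * n_observations / 1024**2

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Missing cells: {missing_pct:.4f}%")
print(f"Total size: {total_mib:.1f} MiB")
```

Under these assumptions the rounded values agree with the 32.7%, < 0.1% and 18.8 MiB shown in the table.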

Variable types

Categorical: 20
Numeric: 11

Warnings

Dataset has 25902 (32.7%) duplicate rows [Duplicates]
Country has a high cardinality: 166 distinct values [High cardinality]
Agent has a high cardinality: 224 distinct values [High cardinality]
Company has a high cardinality: 208 distinct values [High cardinality]
ReservationStatusDate has a high cardinality: 864 distinct values [High cardinality]
ReservationStatus is highly correlated with IsCanceled [High correlation]
IsCanceled is highly correlated with ReservationStatus [High correlation]
PreviousBookingsNotCanceled is highly skewed (γ1 = 23.1165642) [Skewed]
ADR is highly skewed (γ1 = 23.16174514) [Skewed]
LeadTime has 3109 (3.9%) zeros [Zeros]
StaysInWeekendNights has 37817 (47.7%) zeros [Zeros]
StaysInWeekNights has 4963 (6.3%) zeros [Zeros]
PreviousCancellations has 73941 (93.2%) zeros [Zeros]
PreviousBookingsNotCanceled has 77742 (98.0%) zeros [Zeros]
BookingChanges has 69062 (87.1%) zeros [Zeros]
DaysInWaitingList has 75887 (95.7%) zeros [Zeros]
ADR has 1208 (1.5%) zeros [Zeros]
TotalOfSpecialRequests has 47957 (60.5%) zeros [Zeros]
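
Warnings of this kind can be reproduced with simple per-variable rules. The sketch below is illustrative only: the function name `warnings_for` and both thresholds are assumptions rather than values read from this report (the actual settings would be in its config.yaml), and it assumes that any numeric variable with at least one zero is flagged.

```python
# Assumed thresholds (hypothetical; the real values live in the report's config.yaml).
HIGH_CARDINALITY_THRESHOLD = 50
SKEWNESS_THRESHOLD = 20.0


def warnings_for(name, n_rows, distinct=None, skewness=None, zeros=None):
    """Return a list of warning strings for one variable."""
    found = []
    # Categorical variables with many distinct values.
    if distinct is not None and distinct > HIGH_CARDINALITY_THRESHOLD:
        found.append(f"{name} has a high cardinality: {distinct} distinct values")
    # Numeric variables whose skewness is large in absolute value.
    if skewness is not None and abs(skewness) > SKEWNESS_THRESHOLD:
        found.append(f"{name} is highly skewed (γ1 = {skewness})")
    # Numeric variables containing zeros (assumed: any non-zero count is flagged).
    if zeros:
        found.append(f"{name} has {zeros} ({zeros / n_rows:.1%}) zeros")
    return found


print(warnings_for("Country", 79330, distinct=166))
print(warnings_for("ADR", 79330, skewness=23.16174514, zeros=1208))
print(warnings_for("LeadTime", 79330, skewness=1.389834639, zeros=3109))
```

With these assumed thresholds, LeadTime receives only a zeros warning because its skewness (1.39) is far below the cutoff, which matches the list above.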

Reproduction

Analysis started: 2021-03-02 23:10:15.817822
Analysis finished: 2021-03-02 23:11:37.553424
Duration: 1 minute and 21.74 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml
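
The reported duration follows directly from the two timestamps. A minimal check using only the standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the reproduction details above.
started = datetime.fromisoformat("2021-03-02 23:10:15.817822")
finished = datetime.fromisoformat("2021-03-02 23:11:37.553424")

# Elapsed wall-clock time in seconds, split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```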

Variables

IsCanceled
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size619.9 KiB
0
46228 
1
33102 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters79330
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
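
As a rough illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and this is not necessarily how the report computes its figures. The two-letter codes map to the names used in this report: `Nd` is Decimal Number, `Po` is Other Punctuation, `Zs` is Space Separator, `Lu` and `Ll` are Uppercase and Lowercase Letter.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character of every value."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Two illustrative values in the style of the Children column ("0.0", "1.0").
print(category_counts(["0.0", "1.0"]))
```

The standard library exposes general categories only; script and block lookups would need an additional data source.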

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row1
3rd row1
4th row1
5th row1
ValueCountFrequency (%)
046228
58.3%
133102
41.7%
Histogram of lengths of the category
ValueCountFrequency (%)
046228
58.3%
133102
41.7%

Most occurring characters

ValueCountFrequency (%)
046228
58.3%
133102
41.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number79330
100.0%

Most frequent character per category

ValueCountFrequency (%)
046228
58.3%
133102
41.7%

Most occurring scripts

ValueCountFrequency (%)
Common79330
100.0%

Most frequent character per script

ValueCountFrequency (%)
046228
58.3%
133102
41.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII79330
100.0%

Most frequent character per block

ValueCountFrequency (%)
046228
58.3%
133102
41.7%

LeadTime
Real number (ℝ≥0)

ZEROS

Distinct: 453
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 109.7357242
Minimum: 0
Maximum: 629
Zeros: 3109
Zeros (%): 3.9%
Memory size: 619.9 KiB

Quantile statistics

Minimum: 0
5th percentile: 1
Q1: 23
Median: 74
Q3: 163
95th percentile: 334.55
Maximum: 629
Range: 629
Interquartile range (IQR): 140

Descriptive statistics

Standard deviation: 110.9485258
Coefficient of variation (CV): 1.011052021
Kurtosis: 1.804542062
Mean: 109.7357242
Median Absolute Deviation (MAD): 61
Skewness: 1.389834639
Sum: 8705335
Variance: 12309.57537
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
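
Several of the LeadTime statistics above are simple functions of one another. The following sketch recomputes them from the reported sum, standard deviation and quartiles; the variable names are illustrative, and the input figures are copied from this report.

```python
# Figures copied from the LeadTime statistics above.
n = 79330            # number of observations (no missing values)
total = 8705335      # Sum
std = 110.9485258    # Standard deviation
q1, q3 = 23, 163     # Q1 and Q3
minimum, maximum = 0, 629

mean = total / n                 # Mean = Sum / n
cv = std / mean                  # Coefficient of variation = std / mean
variance = std ** 2              # Variance = std squared
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
value_range = maximum - minimum  # Range = Maximum - Minimum

print(mean, cv, variance, iqr, value_range)
```

A coefficient of variation just above 1 means the standard deviation is slightly larger than the mean, which is consistent with the long right tail indicated by the positive skewness.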
Common values

Value               Count   Frequency (%)
0                   3109    3.9%
1                   1865    2.4%
2                   1130    1.4%
4                   1052    1.3%
3                   1022    1.3%
5                   965     1.2%
6                   907     1.1%
7                   758     1.0%
8                   750     0.9%
12                  736     0.9%
Other values (443)  67036   84.5%

Minimum 5 values

Value   Count   Frequency (%)
0       3109    3.9%
1       1865    2.4%
2       1130    1.4%
3       1022    1.3%
4       1052    1.3%

Maximum 5 values

Value   Count   Frequency (%)
629     17      < 0.1%
626     30      < 0.1%
622     17      < 0.1%
615     17      < 0.1%
608     17      < 0.1%
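
Tables like the ones above (most frequent values with an "Other values" bucket, plus the smallest and largest values) can be derived from a simple frequency count. The sketch below is one way to do it in plain Python; the function names are hypothetical, and it assumes there are no missing values.

```python
from collections import Counter


def frequency_table(values, top=10):
    """Top `top` values by count, plus an 'Other values (k)' row for the rest."""
    counts = Counter(values)
    n = len(values)
    rows = [(v, c, c / n * 100) for v, c in counts.most_common(top)]
    shown = sum(c for _, c, _ in rows)
    n_other = len(counts) - len(rows)  # distinct values not shown individually
    if n_other > 0:
        rows.append((f"Other values ({n_other})", n - shown, (n - shown) / n * 100))
    return rows


def extreme_values(values, k=5):
    """The k smallest and k largest distinct values with their counts."""
    counts = Counter(values)
    smallest = [(v, counts[v]) for v in sorted(counts)[:k]]
    largest = [(v, counts[v]) for v in sorted(counts, reverse=True)[:k]]
    return smallest, largest


data = [0] * 5 + [1] * 3 + [2] * 2 + [7]  # small illustrative sample
for row in frequency_table(data, top=2):
    print(row)
print(extreme_values(data, k=2))
```

Note that the common-values table is ordered by count while the extreme-value tables are ordered by value, which is why 4 appears before 3 in the first LeadTime table but after it in the second.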

ArrivalDateYear
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size619.9 KiB
2016
38140 
2017
27508 
2015
13682 

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters317320
Distinct characters6
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2015
2nd row2015
3rd row2015
4th row2015
5th row2015
ValueCountFrequency (%)
201638140
48.1%
201727508
34.7%
201513682
 
17.2%
Histogram of lengths of the category
ValueCountFrequency (%)
201638140
48.1%
201727508
34.7%
201513682
 
17.2%

Most occurring characters

ValueCountFrequency (%)
279330
25.0%
079330
25.0%
179330
25.0%
638140
12.0%
727508
 
8.7%
513682
 
4.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number317320
100.0%

Most frequent character per category

ValueCountFrequency (%)
279330
25.0%
079330
25.0%
179330
25.0%
638140
12.0%
727508
 
8.7%
513682
 
4.3%

Most occurring scripts

ValueCountFrequency (%)
Common317320
100.0%

Most frequent character per script

ValueCountFrequency (%)
279330
25.0%
079330
25.0%
179330
25.0%
638140
12.0%
727508
 
8.7%
513682
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII317320
100.0%

Most frequent character per block

ValueCountFrequency (%)
279330
25.0%
079330
25.0%
179330
25.0%
638140
12.0%
727508
 
8.7%
513682
 
4.3%

ArrivalDateMonth
Categorical

Distinct12
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size619.9 KiB
August
8983 
May
8232 
July
8088 
June
7894 
October
7605 
Other values (7)
38528 

Length

Max length9
Median length6
Mean length5.872066053
Min length3

Characters and Unicode

Total characters465831
Distinct characters26
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowJuly
2nd rowJuly
3rd rowJuly
4th rowJuly
5th rowJuly
ValueCountFrequency (%)
August8983
11.3%
May8232
10.4%
July8088
10.2%
June7894
10.0%
October7605
9.6%
April7480
9.4%
September7400
9.3%
March6458
8.1%
February4965
6.3%
November4357
5.5%
Other values (2)7868
9.9%
Histogram of lengths of the category
ValueCountFrequency (%)
august8983
11.3%
may8232
10.4%
july8088
10.2%
june7894
10.0%
october7605
9.6%
april7480
9.4%
september7400
9.3%
march6458
8.1%
february4965
6.3%
november4357
5.5%
Other values (2)7868
9.9%

Most occurring characters

ValueCountFrequency (%)
e63774
13.7%
r51098
 
11.0%
u42649
 
9.2%
b28459
 
6.1%
a27127
 
5.8%
y25021
 
5.4%
t23988
 
5.1%
J19718
 
4.2%
c18195
 
3.9%
A16463
 
3.5%
Other values (16)149339
32.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter386501
83.0%
Uppercase Letter79330
 
17.0%

Most frequent character per category

ValueCountFrequency (%)
e63774
16.5%
r51098
13.2%
u42649
11.0%
b28459
 
7.4%
a27127
 
7.0%
y25021
 
6.5%
t23988
 
6.2%
c18195
 
4.7%
m15889
 
4.1%
l15568
 
4.0%
Other values (8)74733
19.3%
ValueCountFrequency (%)
J19718
24.9%
A16463
20.8%
M14690
18.5%
O7605
 
9.6%
S7400
 
9.3%
F4965
 
6.3%
N4357
 
5.5%
D4132
 
5.2%

Most occurring scripts

ValueCountFrequency (%)
Latin465831
100.0%

Most frequent character per script

ValueCountFrequency (%)
e63774
13.7%
r51098
 
11.0%
u42649
 
9.2%
b28459
 
6.1%
a27127
 
5.8%
y25021
 
5.4%
t23988
 
5.1%
J19718
 
4.2%
c18195
 
3.9%
A16463
 
3.5%
Other values (16)149339
32.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII465831
100.0%

Most frequent character per block

ValueCountFrequency (%)
e63774
13.7%
r51098
 
11.0%
u42649
 
9.2%
b28459
 
6.1%
a27127
 
5.8%
y25021
 
5.4%
t23988
 
5.1%
J19718
 
4.2%
c18195
 
3.9%
A16463
 
3.5%
Other values (16)149339
32.1%

ArrivalDateWeekNumber
Real number (ℝ≥0)

Distinct: 53
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.17744863
Minimum: 1
Maximum: 53
Zeros: 0
Zeros (%): 0.0%
Memory size: 619.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 5
Q1: 17
Median: 27
Q3: 38
95th percentile: 49
Maximum: 53
Range: 52
Interquartile range (IQR): 21

Descriptive statistics

Standard deviation: 13.39852289
Coefficient of variation (CV): 0.4930014982
Kurtosis: -0.9641446539
Mean: 27.17744863
Median Absolute Deviation (MAD): 11
Skewness: -0.00974797619
Sum: 2155987
Variance: 179.5204157
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value               Count   Frequency (%)
33                  2383    3.0%
42                  2032    2.6%
20                  2016    2.5%
30                  2011    2.5%
32                  1991    2.5%
17                  1965    2.5%
25                  1959    2.5%
34                  1950    2.5%
21                  1948    2.5%
18                  1914    2.4%
Other values (43)   59161   74.6%

Minimum 5 values

Value   Count   Frequency (%)
1       704     0.9%
2       761     1.0%
3       766     1.0%
4       968     1.2%
5       886     1.1%

Maximum 5 values

Value   Count   Frequency (%)
53      1198    1.5%
52      603     0.8%
51      503     0.6%
50      1056    1.3%
49      1066    1.3%

ArrivalDateDayOfMonth
Real number (ℝ≥0)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.78662549
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Memory size: 619.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 8
Median: 16
Q3: 23
95th percentile: 29
Maximum: 31
Range: 30
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.728450546
Coefficient of variation (CV): 0.5529016035
Kurtosis: -1.181540608
Mean: 15.78662549
Median Absolute Deviation (MAD): 8
Skewness: -0.006601999321
Sum: 1252353
Variance: 76.18584893
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
Common values

Value               Count   Frequency (%)
17                  3012    3.8%
15                  2874    3.6%
5                   2857    3.6%
20                  2835    3.6%
25                  2823    3.6%
26                  2748    3.5%
19                  2747    3.5%
9                   2692    3.4%
8                   2676    3.4%
28                  2670    3.4%
Other values (21)   51396   64.8%

Minimum 5 values

Value   Count   Frequency (%)
1       2332    2.9%
2       2667    3.4%
3       2530    3.2%
4       2461    3.1%
5       2857    3.6%

Maximum 5 values

Value   Count   Frequency (%)
31      1352    1.7%
30      2380    3.0%
29      2366    3.0%
28      2670    3.4%
27      2537    3.2%

StaysInWeekendNights
Real number (ℝ≥0)

ZEROS

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.7951846716
Minimum: 0
Maximum: 16
Zeros: 37817
Zeros (%): 47.7%
Memory size: 619.9 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 1
Q3: 2
95th percentile: 2
Maximum: 16
Range: 16
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 0.8850263828
Coefficient of variation (CV): 1.112982197
Kurtosis: 5.003917041
Mean: 0.7951846716
Median Absolute Deviation (MAD): 1
Skewness: 1.133843442
Sum: 63082
Variance: 0.7832716982
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
Common values

Value              Count   Frequency (%)
0                  37817   47.7%
1                  21434   27.0%
2                  19333   24.4%
3                  323     0.4%
4                  297     0.4%
5                  44      0.1%
6                  40      0.1%
8                  24      < 0.1%
7                  6       < 0.1%
9                  6       < 0.1%
Other values (4)   6       < 0.1%

Minimum 5 values

Value   Count   Frequency (%)
0       37817   47.7%
1       21434   27.0%
2       19333   24.4%
3       323     0.4%
4       297     0.4%

Maximum 5 values

Value   Count   Frequency (%)
16      1       < 0.1%
14      2       < 0.1%
13      1       < 0.1%
10      2       < 0.1%
9       6       < 0.1%

StaysInWeekNights
Real number (ℝ≥0)

ZEROS

Distinct: 29
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.182957267
Minimum: 0
Maximum: 41
Zeros: 4963
Zeros (%): 6.3%
Memory size: 619.9 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1
Median: 2
Q3: 3
95th percentile: 5
Maximum: 41
Range: 41
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.456416186
Coefficient of variation (CV): 0.6671757657
Kurtosis: 32.49062966
Mean: 2.182957267
Median Absolute Deviation (MAD): 1
Skewness: 2.955993664
Sum: 173174
Variance: 2.121148108
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=29)
Common values

Value               Count   Frequency (%)
2                   26403   33.3%
1                   21088   26.6%
3                   16371   20.6%
4                   6141    7.7%
0                   4963    6.3%
5                   3266    4.1%
6                   379     0.5%
7                   201     0.3%
8                   160     0.2%
10                  145     0.2%
Other values (19)   213     0.3%

Minimum 5 values

Value   Count   Frequency (%)
0       4963    6.3%
1       21088   26.6%
2       26403   33.3%
3       16371   20.6%
4       6141    7.7%

Maximum 5 values

Value   Count   Frequency (%)
41      1       < 0.1%
35      1       < 0.1%
34      1       < 0.1%
30      1       < 0.1%
25      1       < 0.1%

Adults
Categorical

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size619.9 KiB
2
58255 
1
15879 
3
 
4775
0
 
390
4
 
31

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters79330
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row2
3rd row1
4th row2
5th row2
ValueCountFrequency (%)
258255
73.4%
115879
 
20.0%
34775
 
6.0%
0390
 
0.5%
431
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
258255
73.4%
115879
 
20.0%
34775
 
6.0%
0390
 
0.5%
431
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
258255
73.4%
115879
 
20.0%
34775
 
6.0%
0390
 
0.5%
431
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number79330
100.0%

Most frequent character per category

ValueCountFrequency (%)
258255
73.4%
115879
 
20.0%
34775
 
6.0%
0390
 
0.5%
431
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common79330
100.0%

Most frequent character per script

ValueCountFrequency (%)
258255
73.4%
115879
 
20.0%
34775
 
6.0%
0390
 
0.5%
431
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII79330
100.0%

Most frequent character per block

ValueCountFrequency (%)
258255
73.4%
115879
 
20.0%
34775
 
6.0%
0390
 
0.5%
431
 
< 0.1%

Children
Categorical

Distinct4
Distinct (%)< 0.1%
Missing4
Missing (%)< 0.1%
Memory size619.9 KiB
0.0
74220 
1.0
 
3023
2.0
 
2024
3.0
 
59

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters237978
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0
ValueCountFrequency (%)
0.074220
93.6%
1.03023
 
3.8%
2.02024
 
2.6%
3.059
 
0.1%
(Missing)4
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
0.074220
93.6%
1.03023
 
3.8%
2.02024
 
2.6%
3.059
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0153546
64.5%
.79326
33.3%
13023
 
1.3%
22024
 
0.9%
359
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number158652
66.7%
Other Punctuation79326
33.3%

Most frequent character per category

ValueCountFrequency (%)
0153546
96.8%
13023
 
1.9%
22024
 
1.3%
359
 
< 0.1%
ValueCountFrequency (%)
.79326
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common237978
100.0%

Most frequent character per script

ValueCountFrequency (%)
0153546
64.5%
.79326
33.3%
13023
 
1.3%
22024
 
0.9%
359
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII237978
100.0%

Most frequent character per block

ValueCountFrequency (%)
0153546
64.5%
.79326
33.3%
13023
 
1.3%
22024
 
0.9%
359
 
< 0.1%

Babies
Categorical

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size619.9 KiB
0
78961 
1
 
361
2
 
6
9
 
1
10
 
1

Length

Max length2
Median length1
Mean length1.000012606
Min length1

Characters and Unicode

Total characters79331
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
ValueCountFrequency (%)
078961
99.5%
1361
 
0.5%
26
 
< 0.1%
91
 
< 0.1%
101
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
078961
99.5%
1361
 
0.5%
26
 
< 0.1%
91
 
< 0.1%
101
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
078962
99.5%
1362
 
0.5%
26
 
< 0.1%
91
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number79331
100.0%

Most frequent character per category

ValueCountFrequency (%)
078962
99.5%
1362
 
0.5%
26
 
< 0.1%
91
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common79331
100.0%

Most frequent character per script

ValueCountFrequency (%)
078962
99.5%
1362
 
0.5%
26
 
< 0.1%
91
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII79331
100.0%

Most frequent character per block

ValueCountFrequency (%)
078962
99.5%
1362
 
0.5%
26
 
< 0.1%
91
 
< 0.1%

Meal
Categorical

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size619.9 KiB
BB
62305 
SC
10564 
HB
6417 
FB
 
44

Length

Max length9
Median length9
Mean length9
Min length9

Characters and Unicode

Total characters713970
Distinct characters6
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowHB
2nd rowBB
3rd rowBB
4th rowBB
5th rowBB
ValueCountFrequency (%)
BB 62305
78.5%
SC 10564
 
13.3%
HB 6417
 
8.1%
FB 44
 
0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
bb62305
78.5%
sc10564
 
13.3%
hb6417
 
8.1%
fb44
 
0.1%

Most occurring characters

ValueCountFrequency (%)
555310
77.8%
B131071
 
18.4%
S10564
 
1.5%
C10564
 
1.5%
H6417
 
0.9%
F44
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Space Separator555310
77.8%
Uppercase Letter158660
 
22.2%

Most frequent character per category

ValueCountFrequency (%)
B131071
82.6%
S10564
 
6.7%
C10564
 
6.7%
H6417
 
4.0%
F44
 
< 0.1%
ValueCountFrequency (%)
555310
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common555310
77.8%
Latin158660
 
22.2%

Most frequent character per script

ValueCountFrequency (%)
B131071
82.6%
S10564
 
6.7%
C10564
 
6.7%
H6417
 
4.0%
F44
 
< 0.1%
ValueCountFrequency (%)
555310
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII713970
100.0%

Most frequent character per block

ValueCountFrequency (%)
555310
77.8%
B131071
 
18.4%
S10564
 
1.5%
C10564
 
1.5%
H6417
 
0.9%
F44
 
< 0.1%
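
The length statistics for Meal (every value is 9 characters long, and 555310 of the 713970 characters are spaces, i.e. 7 per row) suggest that the two-letter codes are padded with spaces. A minimal sketch of how such padding could be detected and removed before analysis follows; the helper name `strip_padding` and the sample rows are illustrative and not taken from the dataset.

```python
from collections import Counter


def strip_padding(values):
    """Return the set of raw string lengths and a count of whitespace-stripped values."""
    raw_lengths = {len(v) for v in values}
    stripped = Counter(v.strip() for v in values)
    return raw_lengths, stripped


# Illustrative rows: two-letter codes padded to 9 characters, as the statistics suggest.
sample = [code.ljust(9) for code in ["BB", "BB", "SC", "HB"]]
raw_lengths, stripped = strip_padding(sample)

print(raw_lengths)  # a single raw length indicates fixed-width padding
print(stripped)     # counts keyed by the cleaned codes
```

Stripping the padding leaves the value counts unchanged but makes the categories easier to match against literals such as "BB".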

Country
Categorical

HIGH CARDINALITY

Distinct166
Distinct (%)0.2%
Missing24
Missing (%)< 0.1%
Memory size619.9 KiB
PRT
30960 
FRA
8804 
DEU
6084 
GBR
5315 
ESP
4611 
Other values (161)
23532 

Length

Max length3
Median length3
Mean length2.992825259
Min length2

Characters and Unicode

Total characters237349
Distinct characters26
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique27 ?
Unique (%)< 0.1%

Sample

1st rowPRT
2nd rowPRT
3rd rowPRT
4th rowPRT
5th rowPRT
ValueCountFrequency (%)
PRT30960
39.0%
FRA8804
 
11.1%
DEU6084
 
7.7%
GBR5315
 
6.7%
ESP4611
 
5.8%
ITA3307
 
4.2%
BEL1894
 
2.4%
BRA1794
 
2.3%
USA1618
 
2.0%
NLD1590
 
2.0%
Other values (156)13329
16.8%
Histogram of lengths of the category
ValueCountFrequency (%)
prt30960
39.0%
fra8804
 
11.1%
deu6084
 
7.7%
gbr5315
 
6.7%
esp4611
 
5.8%
ita3307
 
4.2%
bel1894
 
2.4%
bra1794
 
2.3%
usa1618
 
2.0%
nld1590
 
2.0%
Other values (156)13329
16.8%

Most occurring characters

ValueCountFrequency (%)
R51360
21.6%
P36528
15.4%
T35836
15.1%
A18069
 
7.6%
E15087
 
6.4%
U10706
 
4.5%
B9307
 
3.9%
F9172
 
3.9%
S8793
 
3.7%
D8323
 
3.5%
Other values (16)34168
14.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter237349
100.0%

Most frequent character per category

ValueCountFrequency (%)
R51360
21.6%
P36528
15.4%
T35836
15.1%
A18069
 
7.6%
E15087
 
6.4%
U10706
 
4.5%
B9307
 
3.9%
F9172
 
3.9%
S8793
 
3.7%
D8323
 
3.5%
Other values (16)34168
14.4%

Most occurring scripts

ValueCountFrequency (%)
Latin237349
100.0%

Most frequent character per script

ValueCountFrequency (%)
R51360
21.6%
P36528
15.4%
T35836
15.1%
A18069
 
7.6%
E15087
 
6.4%
U10706
 
4.5%
B9307
 
3.9%
F9172
 
3.9%
S8793
 
3.7%
D8323
 
3.5%
Other values (16)34168
14.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII237349
100.0%

Most frequent character per block

ValueCountFrequency (%)
R51360
21.6%
P36528
15.4%
T35836
15.1%
A18069
 
7.6%
E15087
 
6.4%
U10706
 
4.5%
B9307
 
3.9%
F9172
 
3.9%
S8793
 
3.7%
D8323
 
3.5%
Other values (16)34168
14.4%

MarketSegment
Categorical

Distinct8
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size619.9 KiB
Online TA
38748 
Offline TA/TO
16747 
Groups
13975 
Direct
6093 
Corporate
 
2986
Other values (3)
 
781

Length

Max length13
Median length9
Mean length9.109857557
Min length6

Characters and Unicode

Total characters722685
Distinct characters26
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowOffline TA/TO
2nd rowOnline TA
3rd rowOnline TA
4th rowOnline TA
5th rowOnline TA
ValueCountFrequency (%)
Online TA38748
48.8%
Offline TA/TO16747
21.1%
Groups13975
 
17.6%
Direct6093
 
7.7%
Corporate2986
 
3.8%
Complementary542
 
0.7%
Aviation237
 
0.3%
Undefined2
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
ta38748
28.7%
online38748
28.7%
offline16747
12.4%
ta/to16747
12.4%
groups13975
 
10.4%
direct6093
 
4.5%
corporate2986
 
2.2%
complementary542
 
0.4%
aviation237
 
0.2%
undefined2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
n95026
13.1%
O72242
10.0%
T72242
10.0%
e65662
9.1%
i62064
8.6%
l56037
7.8%
A55732
7.7%
55495
7.7%
f33496
 
4.6%
r26582
 
3.7%
Other values (16)128107
17.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter426629
59.0%
Uppercase Letter223814
31.0%
Space Separator55495
 
7.7%
Other Punctuation16747
 
2.3%

Most frequent character per category

ValueCountFrequency (%)
n95026
22.3%
e65662
15.4%
i62064
14.5%
l56037
13.1%
f33496
 
7.9%
r26582
 
6.2%
o20726
 
4.9%
p17503
 
4.1%
u13975
 
3.3%
s13975
 
3.3%
Other values (7)21583
 
5.1%
ValueCountFrequency (%)
O72242
32.3%
T72242
32.3%
A55732
24.9%
G13975
 
6.2%
D6093
 
2.7%
C3528
 
1.6%
U2
 
< 0.1%
ValueCountFrequency (%)
55495
100.0%
ValueCountFrequency (%)
/16747
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin650443
90.0%
Common72242
 
10.0%

Most frequent character per script

ValueCountFrequency (%)
n95026
14.6%
O72242
11.1%
T72242
11.1%
e65662
10.1%
i62064
9.5%
l56037
8.6%
A55732
8.6%
f33496
 
5.1%
r26582
 
4.1%
o20726
 
3.2%
Other values (14)90634
13.9%
ValueCountFrequency (%)
55495
76.8%
/16747
 
23.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII722685
100.0%

Most frequent character per block

ValueCountFrequency (%)
n95026
13.1%
O72242
10.0%
T72242
10.0%
e65662
9.1%
i62064
8.6%
l56037
7.8%
A55732
7.7%
55495
7.7%
f33496
 
4.6%
r26582
 
3.7%
Other values (16)128107
17.7%

DistributionChannel
Categorical

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size619.9 KiB
TA/TO
68945 
Direct
 
6780
Corporate
 
3408
GDS
 
193
Undefined
 
4

Length

Max length9
Median length5
Mean length5.252640867
Min length3

Characters and Unicode

Total characters416692
Distinct characters20
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowTA/TO
2nd rowTA/TO
3rd rowTA/TO
4th rowTA/TO
5th rowTA/TO
ValueCountFrequency (%)
TA/TO68945
86.9%
Direct6780
 
8.5%
Corporate3408
 
4.3%
GDS193
 
0.2%
Undefined4
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
ta/to68945
86.9%
direct6780
 
8.5%
corporate3408
 
4.3%
gds193
 
0.2%
undefined4
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
T137890
33.1%
A68945
16.5%
/68945
16.5%
O68945
16.5%
r13596
 
3.3%
e10196
 
2.4%
t10188
 
2.4%
D6973
 
1.7%
o6816
 
1.6%
i6784
 
1.6%
Other values (10)17414
 
4.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter286551
68.8%
Other Punctuation68945
 
16.5%
Lowercase Letter61196
 
14.7%

Most frequent character per category

ValueCountFrequency (%)
r13596
22.2%
e10196
16.7%
t10188
16.6%
o6816
11.1%
i6784
11.1%
c6780
11.1%
p3408
 
5.6%
a3408
 
5.6%
n8
 
< 0.1%
d8
 
< 0.1%
ValueCountFrequency (%)
T137890
48.1%
A68945
24.1%
O68945
24.1%
D6973
 
2.4%
C3408
 
1.2%
G193
 
0.1%
S193
 
0.1%
U4
 
< 0.1%
ValueCountFrequency (%)
/68945
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin347747
83.5%
Common68945
 
16.5%

Most frequent character per script

ValueCountFrequency (%)
T137890
39.7%
A68945
19.8%
O68945
19.8%
r13596
 
3.9%
e10196
 
2.9%
t10188
 
2.9%
D6973
 
2.0%
o6816
 
2.0%
i6784
 
2.0%
c6780
 
1.9%
Other values (9)10634
 
3.1%
ValueCountFrequency (%)
/68945
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII416692
100.0%

Most frequent character per block

ValueCountFrequency (%)
T137890
33.1%
A68945
16.5%
/68945
16.5%
O68945
16.5%
r13596
 
3.3%
e10196
 
2.4%
t10188
 
2.4%
D6973
 
1.7%
o6816
 
1.6%
i6784
 
1.6%
Other values (10)17414
 
4.2%

IsRepeatedGuest
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size619.9 KiB
0
77298 
1
 
2032

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters79330
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
ValueCountFrequency (%)
077298
97.4%
12032
 
2.6%
Histogram of lengths of the category
ValueCountFrequency (%)
077298
97.4%
12032
 
2.6%

Most occurring characters

ValueCountFrequency (%)
077298
97.4%
12032
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number79330
100.0%

Most frequent character per category

ValueCountFrequency (%)
077298
97.4%
12032
 
2.6%

Most occurring scripts

ValueCountFrequency (%)
Common79330
100.0%

Most frequent character per script

ValueCountFrequency (%)
077298
97.4%
12032
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII79330
100.0%

Most frequent character per block

ValueCountFrequency (%)
077298
97.4%
12032
 
2.6%

PreviousCancellations
Real number (ℝ≥0)

ZEROS

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.07974284634
Minimum: 0
Maximum: 21
Zeros: 73941
Zeros (%): 93.2%
Memory size: 619.9 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 1
Maximum: 21
Range: 21
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4154722143
Coefficient of variation (CV): 5.210150294
Kurtosis: 451.5578665
Mean: 0.07974284634
Median Absolute Deviation (MAD): 0
Skewness: 16.58489813
Sum: 6326
Variance: 0.1726171609
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Common values

Value   Count   Frequency (%)
0       73941   93.2%
1       5155    6.5%
2       72      0.1%
3       51      0.1%
11      35      < 0.1%
4       25      < 0.1%
6       22      < 0.1%
5       16      < 0.1%
13      12      < 0.1%
21      1       < 0.1%

Minimum 5 values

Value   Count   Frequency (%)
0       73941   93.2%
1       5155    6.5%
2       72      0.1%
3       51      0.1%
4       25      < 0.1%

Maximum 5 values

Value   Count   Frequency (%)
21      1       < 0.1%
13      12      < 0.1%
11      35      < 0.1%
6       22      < 0.1%
5       16      < 0.1%

PreviousBookingsNotCanceled
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 73
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.132371108
Minimum: 0
Maximum: 72
Zeros: 77742
Zeros (%): 98.0%
Memory size: 619.9 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 0
Maximum: 72
Range: 72
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.693410601
Coefficient of variation (CV): 12.79290191
Kurtosis: 690.7081155
Mean: 0.132371108
Median Absolute Deviation (MAD): 0
Skewness: 23.1165642
Sum: 10501
Variance: 2.867639463
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
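
The SKEWED flag on this variable comes from the skewness coefficient γ1 (23.1165642 here): 98.0% of rows are zero, while a few guests have up to 72 prior bookings. The sketch below shows one common way to compute sample skewness, the bias-adjusted Fisher-Pearson coefficient. That the report uses exactly this estimator is an assumption, and the function name is hypothetical.

```python
import math


def sample_skewness(values):
    """Bias-adjusted Fisher-Pearson skewness coefficient."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(values) / n
    # Second and third central moments (population form, divided by n).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        return 0.0  # constant data: treat skewness as zero
    g1 = m3 / m2 ** 1.5
    # Small-sample adjustment factor.
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


# A zero-heavy toy sample with one large value is strongly right-skewed.
print(sample_skewness([0, 0, 0, 0, 10]))
# A symmetric sample has zero skewness.
print(sample_skewness([1, 2, 3]))
```

Even the five-element toy sample shows the pattern seen in this variable: a mass of zeros plus a rare large value yields a large positive coefficient.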
Common values

Value               Count   Frequency (%)
0                   77742   98.0%
1                   569     0.7%
2                   192     0.2%
3                   129     0.2%
4                   102     0.1%
5                   90      0.1%
6                   59      0.1%
7                   51      0.1%
8                   37      < 0.1%
9                   36      < 0.1%
Other values (63)   323     0.4%

Minimum 5 values

Value   Count   Frequency (%)
0       77742   98.0%
1       569     0.7%
2       192     0.2%
3       129     0.2%
4       102     0.1%

Maximum 5 values

Value   Count   Frequency (%)
72      1       < 0.1%
71      1       < 0.1%
70      1       < 0.1%
69      1       < 0.1%
68      1       < 0.1%

ReservedRoomType
Categorical

Distinct8
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size619.9 KiB
A
62595 
D
11768 
F
 
1791
E
 
1553
B
 
1115
Other values (3)
 
508

Length

Max length16
Median length16
Mean length16
Min length16

Characters and Unicode

Total characters1269280
Distinct characters9
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA
2nd rowA
3rd rowA
4th rowA
5th rowA
ValueCountFrequency (%)
A 62595
78.9%
D 11768
 
14.8%
F 1791
 
2.3%
E 1553
 
2.0%
B 1115
 
1.4%
G 484
 
0.6%
C 14
 
< 0.1%
P 10
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
a62595
78.9%
d11768
 
14.8%
f1791
 
2.3%
e1553
 
2.0%
b1115
 
1.4%
g484
 
0.6%
c14
 
< 0.1%
p10
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
1189950
93.8%
A62595
 
4.9%
D11768
 
0.9%
F1791
 
0.1%
E1553
 
0.1%
B1115
 
0.1%
G484
 
< 0.1%
C14
 
< 0.1%
P10
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Space Separator1189950
93.8%
Uppercase Letter79330
 
6.2%

Most frequent character per category

ValueCountFrequency (%)
A62595
78.9%
D11768
 
14.8%
F1791
 
2.3%
E1553
 
2.0%
B1115
 
1.4%
G484
 
0.6%
C14
 
< 0.1%
P10
 
< 0.1%
ValueCountFrequency (%)
1189950
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common1189950
93.8%
Latin79330
 
6.2%

Most frequent character per script

ValueCountFrequency (%)
A62595
78.9%
D11768
 
14.8%
F1791
 
2.3%
E1553
 
2.0%
B1115
 
1.4%
G484
 
0.6%
C14
 
< 0.1%
P10
 
< 0.1%
ValueCountFrequency (%)
1189950
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1269280
100.0%

Most frequent character per block

ValueCountFrequency (%)
1189950
93.8%
A62595
 
4.9%
D11768
 
0.9%
F1791
 
0.1%
E1553
 
0.1%
B1115
 
0.1%
G484
 
< 0.1%
C14
 
< 0.1%
P10
 
< 0.1%

AssignedRoomType
Categorical

Distinct9
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size619.9 KiB
A
57007 
D
14983 
E
 
2168
F
 
2018
B
 
2004
Other values (4)
 
1150

Length

Max length16
Median length16
Mean length16
Min length16

Characters and Unicode

Total characters1269280
Distinct characters10
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA
2nd rowA
3rd rowA
4th rowA
5th rowA
ValueCountFrequency (%)
A 57007
71.9%
D 14983
 
18.9%
E 2168
 
2.7%
F 2018
 
2.5%
B 2004
 
2.5%
G 700
 
0.9%
K 279
 
0.4%
C 161
 
0.2%
P 10
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
a57007
71.9%
d14983
 
18.9%
e2168
 
2.7%
f2018
 
2.5%
b2004
 
2.5%
g700
 
0.9%
k279
 
0.4%
c161
 
0.2%
p10
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
1189950
93.8%
A57007
 
4.5%
D14983
 
1.2%
E2168
 
0.2%
F2018
 
0.2%
B2004
 
0.2%
G700
 
0.1%
K279
 
< 0.1%
C161
 
< 0.1%
P10
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Space Separator1189950
93.8%
Uppercase Letter79330
 
6.2%

Most frequent character per category

ValueCountFrequency (%)
A57007
71.9%
D14983
 
18.9%
E2168
 
2.7%
F2018
 
2.5%
B2004
 
2.5%
G700
 
0.9%
K279
 
0.4%
C161
 
0.2%
P10
 
< 0.1%
ValueCountFrequency (%)
1189950
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common1189950
93.8%
Latin79330
 
6.2%

Most frequent character per script

ValueCountFrequency (%)
A57007
71.9%
D14983
 
18.9%
E2168
 
2.7%
F2018
 
2.5%
B2004
 
2.5%
G700
 
0.9%
K279
 
0.4%
C161
 
0.2%
P10
 
< 0.1%
ValueCountFrequency (%)
1189950
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1269280
100.0%

Most frequent character per block

ValueCountFrequency (%)
1189950
93.8%
A57007
 
4.5%
D14983
 
1.2%
E2168
 
0.2%
F2018
 
0.2%
B2004
 
0.2%
G700
 
0.1%
K279
 
< 0.1%
C161
 
< 0.1%
P10
 
< 0.1%

BookingChanges
Real number (ℝ≥0)

ZEROS

Distinct21
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.1873692172
Minimum0
Maximum21
Zeros69062
Zeros (%)87.1%
Memory size619.9 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile1
Maximum21
Range21
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.6086203136
Coefficient of variation (CV)3.248240681
Kurtosis118.5593153
Mean0.1873692172
Median Absolute Deviation (MAD)0
Skewness7.245915202
Sum14864
Variance0.3704186862
MonotonicityNot monotonic
Histogram with fixed size bins (bins=21)
ValueCountFrequency (%)
069062
87.1%
17232
 
9.1%
22244
 
2.8%
3467
 
0.6%
4194
 
0.2%
546
 
0.1%
631
 
< 0.1%
719
 
< 0.1%
89
 
< 0.1%
145
 
< 0.1%
Other values (11)21
 
< 0.1%
ValueCountFrequency (%)
069062
87.1%
17232
 
9.1%
22244
 
2.8%
3467
 
0.6%
4194
 
0.2%
ValueCountFrequency (%)
211
< 0.1%
201
< 0.1%
181
< 0.1%
171
< 0.1%
161
< 0.1%

DepositType
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size619.9 KiB
No Deposit
66442 
Non Refund
12868 
Refundable
 
20

Length

Max length15
Median length15
Mean length15
Min length15

Characters and Unicode

Total characters1189950
Distinct characters17
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNo Deposit
2nd rowNo Deposit
3rd rowNo Deposit
4th rowNo Deposit
5th rowNo Deposit
ValueCountFrequency (%)
No Deposit 66442
83.8%
Non Refund 12868
 
16.2%
Refundable 20
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
deposit66442
41.9%
no66442
41.9%
non12868
 
8.1%
refund12868
 
8.1%
refundable20
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
475960
40.0%
o145752
 
12.2%
e79350
 
6.7%
N79310
 
6.7%
D66442
 
5.6%
p66442
 
5.6%
s66442
 
5.6%
i66442
 
5.6%
t66442
 
5.6%
n25756
 
2.2%
Other values (7)51612
 
4.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter555350
46.7%
Space Separator475960
40.0%
Uppercase Letter158640
 
13.3%

Most frequent character per category

ValueCountFrequency (%)
o145752
26.2%
e79350
14.3%
p66442
12.0%
s66442
12.0%
i66442
12.0%
t66442
12.0%
n25756
 
4.6%
f12888
 
2.3%
u12888
 
2.3%
d12888
 
2.3%
Other values (3)60
 
< 0.1%
ValueCountFrequency (%)
N79310
50.0%
D66442
41.9%
R12888
 
8.1%
ValueCountFrequency (%)
475960
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin713990
60.0%
Common475960
40.0%

Most frequent character per script

ValueCountFrequency (%)
o145752
20.4%
e79350
11.1%
N79310
11.1%
D66442
9.3%
p66442
9.3%
s66442
9.3%
i66442
9.3%
t66442
9.3%
n25756
 
3.6%
R12888
 
1.8%
Other values (6)38724
 
5.4%
ValueCountFrequency (%)
475960
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1189950
100.0%

Most frequent character per block

ValueCountFrequency (%)
475960
40.0%
o145752
 
12.2%
e79350
 
6.7%
N79310
 
6.7%
D66442
 
5.6%
p66442
 
5.6%
s66442
 
5.6%
i66442
 
5.6%
t66442
 
5.6%
n25756
 
2.2%
Other values (7)51612
 
4.3%

Agent
Categorical

HIGH CARDINALITY

Distinct224
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size619.9 KiB
9
31955 
NULL
8131 
1
7137 
14
3640 
7
3539 
Other values (219)
24928 

Length

Max length11
Median length11
Mean length11
Min length11

Characters and Unicode

Total characters872630
Distinct characters14
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique38 ?
Unique (%)< 0.1%

Sample

1st row 6
2nd row 9
3rd row 9
4th row 9
5th row 9
ValueCountFrequency (%)
931955
40.3%
NULL8131
 
10.2%
17137
 
9.0%
143640
 
4.6%
73539
 
4.5%
62683
 
3.4%
281666
 
2.1%
31308
 
1.6%
81236
 
1.6%
371230
 
1.6%
Other values (214)16805
21.2%
Histogram of lengths of the category
ValueCountFrequency (%)
931955
40.3%
null8131
 
10.2%
17137
 
9.0%
143640
 
4.6%
73539
 
4.5%
62683
 
3.4%
281666
 
2.1%
31308
 
1.6%
81236
 
1.6%
371230
 
1.6%
Other values (214)16805
21.2%

Most occurring characters

ValueCountFrequency (%)
740620
84.9%
936250
 
4.2%
119113
 
2.2%
L16262
 
1.9%
29607
 
1.1%
N8131
 
0.9%
U8131
 
0.9%
36509
 
0.7%
76396
 
0.7%
46022
 
0.7%
Other values (4)15589
 
1.8%

Most occurring categories

ValueCountFrequency (%)
Space Separator740620
84.9%
Decimal Number99486
 
11.4%
Uppercase Letter32524
 
3.7%

Most frequent character per category

ValueCountFrequency (%)
936250
36.4%
119113
19.2%
29607
 
9.7%
36509
 
6.5%
76396
 
6.4%
46022
 
6.1%
85978
 
6.0%
64811
 
4.8%
52996
 
3.0%
01804
 
1.8%
ValueCountFrequency (%)
L16262
50.0%
N8131
25.0%
U8131
25.0%
ValueCountFrequency (%)
740620
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common840106
96.3%
Latin32524
 
3.7%

Most frequent character per script

ValueCountFrequency (%)
740620
88.2%
936250
 
4.3%
119113
 
2.3%
29607
 
1.1%
36509
 
0.8%
76396
 
0.8%
46022
 
0.7%
85978
 
0.7%
64811
 
0.6%
52996
 
0.4%
ValueCountFrequency (%)
L16262
50.0%
N8131
25.0%
U8131
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII872630
100.0%

Most frequent character per block

ValueCountFrequency (%)
740620
84.9%
936250
 
4.2%
119113
 
2.2%
L16262
 
1.9%
29607
 
1.1%
N8131
 
0.9%
U8131
 
0.9%
36509
 
0.7%
76396
 
0.7%
46022
 
0.7%
Other values (4)15589
 
1.8%

Company
Categorical

HIGH CARDINALITY

Distinct208
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size619.9 KiB
NULL
75641 
40
 
924
67
 
267
45
 
250
153
 
215
Other values (203)
 
2033

Length

Max length11
Median length11
Mean length11
Min length11

Characters and Unicode

Total characters872630
Distinct characters14
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique74 ?
Unique (%)0.1%

Sample

1st row NULL
2nd row NULL
3rd row NULL
4th row NULL
5th row NULL
ValueCountFrequency (%)
NULL75641
95.3%
40924
 
1.2%
67267
 
0.3%
45250
 
0.3%
153215
 
0.3%
219141
 
0.2%
233114
 
0.1%
174113
 
0.1%
5186
 
0.1%
24261
 
0.1%
Other values (198)1518
 
1.9%
Histogram of lengths of the category
ValueCountFrequency (%)
null75641
95.3%
40924
 
1.2%
67267
 
0.3%
45250
 
0.3%
153215
 
0.3%
219141
 
0.2%
233114
 
0.1%
174113
 
0.1%
5186
 
0.1%
24261
 
0.1%
Other values (198)1518
 
1.9%

Most occurring characters

ValueCountFrequency (%)
560896
64.3%
L151282
 
17.3%
N75641
 
8.7%
U75641
 
8.7%
41844
 
0.2%
01216
 
0.1%
11154
 
0.1%
21016
 
0.1%
3973
 
0.1%
5769
 
0.1%
Other values (4)2198
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Space Separator560896
64.3%
Uppercase Letter302564
34.7%
Decimal Number9170
 
1.1%

Most frequent character per category

ValueCountFrequency (%)
41844
20.1%
01216
13.3%
11154
12.6%
21016
11.1%
3973
10.6%
5769
8.4%
7657
 
7.2%
6600
 
6.5%
8503
 
5.5%
9438
 
4.8%
ValueCountFrequency (%)
L151282
50.0%
N75641
25.0%
U75641
25.0%
ValueCountFrequency (%)
560896
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common570066
65.3%
Latin302564
34.7%

Most frequent character per script

ValueCountFrequency (%)
560896
98.4%
41844
 
0.3%
01216
 
0.2%
11154
 
0.2%
21016
 
0.2%
3973
 
0.2%
5769
 
0.1%
7657
 
0.1%
6600
 
0.1%
8503
 
0.1%
ValueCountFrequency (%)
L151282
50.0%
N75641
25.0%
U75641
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII872630
100.0%

Most frequent character per block

ValueCountFrequency (%)
560896
64.3%
L151282
 
17.3%
N75641
 
8.7%
U75641
 
8.7%
41844
 
0.2%
01216
 
0.1%
11154
 
0.1%
21016
 
0.1%
3973
 
0.1%
5769
 
0.1%
Other values (4)2198
 
0.3%

DaysInWaitingList
Real number (ℝ≥0)

ZEROS

Distinct115
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.226774234
Minimum0
Maximum391
Zeros75887
Zeros (%)95.7%
Memory size619.9 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile0
Maximum391
Range391
Interquartile range (IQR)0

Descriptive statistics

Standard deviation20.87089
Coefficient of variation (CV)6.468035407
Kurtosis137.5299571
Mean3.226774234
Median Absolute Deviation (MAD)0
Skewness10.30524112
Sum255980
Variance435.5940492
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
075887
95.7%
39226
 
0.3%
58164
 
0.2%
44140
 
0.2%
31127
 
0.2%
3596
 
0.1%
4694
 
0.1%
6989
 
0.1%
6383
 
0.1%
8780
 
0.1%
Other values (105)2344
 
3.0%
ValueCountFrequency (%)
075887
95.7%
17
 
< 0.1%
23
 
< 0.1%
359
 
0.1%
422
 
< 0.1%
ValueCountFrequency (%)
39145
0.1%
37915
 
< 0.1%
33015
 
< 0.1%
25910
 
< 0.1%
23635
< 0.1%

CustomerType
Categorical

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size619.9 KiB
Transient
59404 
Transient-Party
17333 
Contract
 
2300
Group
 
293

Length

Max length15
Median length9
Mean length10.2671877
Min length5

Characters and Unicode

Total characters814496
Distinct characters17
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowTransient
2nd rowTransient
3rd rowTransient
4th rowTransient
5th rowTransient
ValueCountFrequency (%)
Transient59404
74.9%
Transient-Party17333
 
21.8%
Contract2300
 
2.9%
Group293
 
0.4%
Histogram of lengths of the category
ValueCountFrequency (%)
transient59404
74.9%
transient-party17333
 
21.8%
contract2300
 
2.9%
group293
 
0.4%

Most occurring characters

ValueCountFrequency (%)
n155774
19.1%
t98670
12.1%
r96663
11.9%
a96370
11.8%
T76737
9.4%
s76737
9.4%
i76737
9.4%
e76737
9.4%
-17333
 
2.1%
P17333
 
2.1%
Other values (7)25405
 
3.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter700500
86.0%
Uppercase Letter96663
 
11.9%
Dash Punctuation17333
 
2.1%

Most frequent character per category

ValueCountFrequency (%)
n155774
22.2%
t98670
14.1%
r96663
13.8%
a96370
13.8%
s76737
11.0%
i76737
11.0%
e76737
11.0%
y17333
 
2.5%
o2593
 
0.4%
c2300
 
0.3%
Other values (2)586
 
0.1%
ValueCountFrequency (%)
T76737
79.4%
P17333
 
17.9%
C2300
 
2.4%
G293
 
0.3%
ValueCountFrequency (%)
-17333
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin797163
97.9%
Common17333
 
2.1%

Most frequent character per script

ValueCountFrequency (%)
n155774
19.5%
t98670
12.4%
r96663
12.1%
a96370
12.1%
T76737
9.6%
s76737
9.6%
i76737
9.6%
e76737
9.6%
P17333
 
2.2%
y17333
 
2.2%
Other values (6)8072
 
1.0%
ValueCountFrequency (%)
-17333
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII814496
100.0%

Most frequent character per block

ValueCountFrequency (%)
n155774
19.1%
t98670
12.1%
r96663
11.9%
a96370
11.8%
T76737
9.4%
s76737
9.4%
i76737
9.4%
e76737
9.4%
-17333
 
2.1%
P17333
 
2.1%
Other values (7)25405
 
3.1%

ADR
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct5405
Distinct (%)6.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean105.3044654
Minimum0
Maximum5400
Zeros1208
Zeros (%)1.5%
Memory size619.9 KiB

Quantile statistics

Minimum0
5-th percentile62
Q179.2
median99.9
Q3126
95-th percentile175.5
Maximum5400
Range5400
Interquartile range (IQR)46.8

Descriptive statistics

Standard deviation43.60295384
Coefficient of variation (CV)0.4140655733
Kurtosis2741.52335
Mean105.3044654
Median Absolute Deviation (MAD)23.32
Skewness23.16174514
Sum8353803.24
Variance1901.217583
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
623593
 
4.5%
752372
 
3.0%
902208
 
2.8%
652057
 
2.6%
951501
 
1.9%
1201458
 
1.8%
801426
 
1.8%
1001422
 
1.8%
1101382
 
1.7%
1301222
 
1.5%
Other values (5395)60689
76.5%
ValueCountFrequency (%)
01208
1.5%
0.51
 
< 0.1%
115
 
< 0.1%
1.291
 
< 0.1%
1.481
 
< 0.1%
ValueCountFrequency (%)
54001
< 0.1%
5101
< 0.1%
451.51
< 0.1%
375.51
< 0.1%
372.331
< 0.1%
Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size619.9 KiB
0
77404 
1
 
1921
2
 
3
3
 
2

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters79330
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
ValueCountFrequency (%)
077404
97.6%
11921
 
2.4%
23
 
< 0.1%
32
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
077404
97.6%
11921
 
2.4%
23
 
< 0.1%
32
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
077404
97.6%
11921
 
2.4%
23
 
< 0.1%
32
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number79330
100.0%

Most frequent character per category

ValueCountFrequency (%)
077404
97.6%
11921
 
2.4%
23
 
< 0.1%
32
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common79330
100.0%

Most frequent character per script

ValueCountFrequency (%)
077404
97.6%
11921
 
2.4%
23
 
< 0.1%
32
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII79330
100.0%

Most frequent character per block

ValueCountFrequency (%)
077404
97.6%
11921
 
2.4%
23
 
< 0.1%
32
 
< 0.1%

TotalOfSpecialRequests
Real number (ℝ≥0)

ZEROS

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.5469179377
Minimum0
Maximum5
Zeros47957
Zeros (%)60.5%
Memory size619.9 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q31
95-th percentile2
Maximum5
Range5
Interquartile range (IQR)1

Descriptive statistics

Standard deviation0.7807762224
Coefficient of variation (CV)1.427593005
Kurtosis1.658703225
Mean0.5469179377
Median Absolute Deviation (MAD)0
Skewness1.403768024
Sum43387
Variance0.6096115095
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)
ValueCountFrequency (%)
047957
60.5%
121420
27.0%
28142
 
10.3%
31587
 
2.0%
4198
 
0.2%
526
 
< 0.1%
ValueCountFrequency (%)
047957
60.5%
121420
27.0%
28142
 
10.3%
31587
 
2.0%
4198
 
0.2%
ValueCountFrequency (%)
526
 
< 0.1%
4198
 
0.2%
31587
 
2.0%
28142
 
10.3%
121420
27.0%

ReservationStatus
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size619.9 KiB
Check-Out
46228 
Canceled
32186 
No-Show
 
916

Length

Max length9
Median length9
Mean length8.571183663
Min length7

Characters and Unicode

Total characters679952
Distinct characters17
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowCheck-Out
2nd rowCanceled
3rd rowCanceled
4th rowCanceled
5th rowCanceled
ValueCountFrequency (%)
Check-Out46228
58.3%
Canceled32186
40.6%
No-Show916
 
1.2%
Histogram of lengths of the category
ValueCountFrequency (%)
check-out46228
58.3%
canceled32186
40.6%
no-show916
 
1.2%

Most occurring characters

ValueCountFrequency (%)
e110600
16.3%
C78414
11.5%
c78414
11.5%
h47144
6.9%
-47144
6.9%
k46228
6.8%
O46228
6.8%
u46228
6.8%
t46228
6.8%
a32186
 
4.7%
Other values (7)101138
14.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter506334
74.5%
Uppercase Letter126474
 
18.6%
Dash Punctuation47144
 
6.9%

Most frequent character per category

ValueCountFrequency (%)
e110600
21.8%
c78414
15.5%
h47144
9.3%
k46228
9.1%
u46228
9.1%
t46228
9.1%
a32186
 
6.4%
n32186
 
6.4%
l32186
 
6.4%
d32186
 
6.4%
Other values (2)2748
 
0.5%
ValueCountFrequency (%)
C78414
62.0%
O46228
36.6%
N916
 
0.7%
S916
 
0.7%
ValueCountFrequency (%)
-47144
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin632808
93.1%
Common47144
 
6.9%

Most frequent character per script

ValueCountFrequency (%)
e110600
17.5%
C78414
12.4%
c78414
12.4%
h47144
7.4%
k46228
7.3%
O46228
7.3%
u46228
7.3%
t46228
7.3%
a32186
 
5.1%
n32186
 
5.1%
Other values (6)68952
10.9%
ValueCountFrequency (%)
-47144
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII679952
100.0%

Most frequent character per block

ValueCountFrequency (%)
e110600
16.3%
C78414
11.5%
c78414
11.5%
h47144
6.9%
-47144
6.9%
k46228
6.8%
O46228
6.8%
u46228
6.8%
t46228
6.8%
a32186
 
4.7%
Other values (7)101138
14.9%

ReservationStatusDate
Categorical

HIGH CARDINALITY

Distinct864
Distinct (%)1.1%
Missing0
Missing (%)0.0%
Memory size619.9 KiB
2015-10-21
 
1416
2015-07-06
 
763
2015-01-01
 
760
2016-11-25
 
746
2016-01-18
 
553
Other values (859)
75092 

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters793300
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique19 ?
Unique (%)< 0.1%

Sample

1st row2015-07-03
2nd row2015-07-01
3rd row2015-04-30
4th row2015-06-23
5th row2015-04-02
ValueCountFrequency (%)
2015-10-211416
 
1.8%
2015-07-06763
 
1.0%
2015-01-01760
 
1.0%
2016-11-25746
 
0.9%
2016-01-18553
 
0.7%
2015-07-02455
 
0.6%
2015-12-18402
 
0.5%
2016-12-07392
 
0.5%
2017-01-24290
 
0.4%
2016-03-15285
 
0.4%
Other values (854)73268
92.4%
Histogram of lengths of the category
ValueCountFrequency (%)
2015-10-211416
 
1.8%
2015-07-06763
 
1.0%
2015-01-01760
 
1.0%
2016-11-25746
 
0.9%
2016-01-18553
 
0.7%
2015-07-02455
 
0.6%
2015-12-18402
 
0.5%
2016-12-07392
 
0.5%
2017-01-24290
 
0.4%
2016-03-15285
 
0.4%
Other values (854)73268
92.4%

Most occurring characters

ValueCountFrequency (%)
0179738
22.7%
-158660
20.0%
1145481
18.3%
2125032
15.8%
653094
 
6.7%
739878
 
5.0%
530746
 
3.9%
317346
 
2.2%
815153
 
1.9%
914267
 
1.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number634640
80.0%
Dash Punctuation158660
 
20.0%

Most frequent character per category

ValueCountFrequency (%)
0179738
28.3%
1145481
22.9%
2125032
19.7%
653094
 
8.4%
739878
 
6.3%
530746
 
4.8%
317346
 
2.7%
815153
 
2.4%
914267
 
2.2%
413905
 
2.2%
ValueCountFrequency (%)
-158660
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common793300
100.0%

Most frequent character per script

ValueCountFrequency (%)
0179738
22.7%
-158660
20.0%
1145481
18.3%
2125032
15.8%
653094
 
6.7%
739878
 
5.0%
530746
 
3.9%
317346
 
2.2%
815153
 
1.9%
914267
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII793300
100.0%

Most frequent character per block

ValueCountFrequency (%)
0179738
22.7%
-158660
20.0%
1145481
18.3%
2125032
15.8%
653094
 
6.7%
739878
 
5.0%
530746
 
3.9%
317346
 
2.2%
815153
 
1.9%
914267
 
1.8%

Interactions

(Interaction plots; images not preserved in this text version.)

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
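As a rough illustration of that calculation, the sketch below computes r in plain Python. The function name `pearson_r` is hypothetical, and the input values are simply the LeadTime and ADR entries of the first five sample rows; this is not how the report itself computes the matrix.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of products and squares are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Illustrative values only (LeadTime and ADR of the first five sample rows).
lead_time = [6, 88, 65, 92, 100]
adr = [0.0, 76.5, 68.0, 76.5, 76.5]

r = pearson_r(lead_time, adr)
# Shifting and positively rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([2 * x + 10 for x in lead_time], adr)
print(r, r_rescaled)
```

The second call demonstrates the location and scale invariance mentioned above: both printed values agree up to floating-point rounding.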

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
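The following is a minimal sketch of that recipe, assuming ties receive the average of the ranks they span (a common convention, not confirmed by the report). The helper names `average_ranks` and `spearman_rho` are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Pearson's r, used here on the rank variables."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(vx * vy)

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the ranks of xs and ys."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    return pearson_r(average_ranks(xs), average_ranks(ys))

# A monotonic but nonlinear relationship (made-up data).
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

On this made-up data the ranks of `x` and `y` coincide, so ρ reaches its maximum while Pearson's r on the raw values stays below it.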

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
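The pair-counting definition can be sketched directly in plain Python. Note the assumption: this is the simple variant (often called τ-a) in which tied pairs count as neither concordant nor discordant; library implementations frequently apply a tie correction, so their values may differ on data with many ties. The function name is hypothetical.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y: counted in the total only
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Made-up example: one swapped pair out of six.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

This quadratic-time loop is fine for illustration but would be slow on tens of thousands of rows.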

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013, which can be found here.
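To make the bias correction concrete, here is a small plain-Python sketch of the corrected estimator as it is usually stated: the squared phi coefficient and the table dimensions are each adjusted by terms in 1/(n - 1). The function name is hypothetical, the inputs are the IsCanceled and ReservationStatus values of the first ten sample rows, and the report's own implementation may differ in details.

```python
import math
from collections import Counter

def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equal-length sequences of category labels."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))   # contingency table as a sparse counter
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over all cells, including empty ones.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (observed[(a, b)] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Corrections: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0   # a variable with a single category carries no association
    return math.sqrt(phi2_corr / denom)

# IsCanceled and ReservationStatus of the first ten sample rows.
is_canceled = [0, 1, 1, 1, 1, 1, 0, 1, 1, 1]
status = ["Check-Out", "Canceled", "Canceled", "Canceled", "Canceled",
          "Canceled", "Check-Out", "Canceled", "No-Show", "No-Show"]
v = cramers_v_corrected(is_canceled, status)
print(v)
```

With only ten rows the correction keeps the value somewhat below 1 even though ReservationStatus fully determines IsCanceled here; on the full dataset the two variables are flagged as highly correlated.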

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

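Index | IsCanceled | LeadTime | ArrivalDateYear | ArrivalDateMonth | ArrivalDateWeekNumber | ArrivalDateDayOfMonth | StaysInWeekendNights | StaysInWeekNights | Adults | Children | Babies | Meal | Country | MarketSegment | DistributionChannel | IsRepeatedGuest | PreviousCancellations | PreviousBookingsNotCanceled | ReservedRoomType | AssignedRoomType | BookingChanges | DepositType | Agent | Company | DaysInWaitingList | CustomerType | ADR | RequiredCarParkingSpaces | TotalOfSpecialRequests | ReservationStatus | ReservationStatusDate
0 | 0 | 6 | 2015 | July | 27 | 1 | 0 | 2 | 1 | 0.0 | 0 | HB | PRT | Offline TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 6 | NULL | 0 | Transient | 0.00 | 0 | 0 | Check-Out | 2015-07-03
1 | 1 | 88 | 2015 | July | 27 | 1 | 0 | 4 | 2 | 0.0 | 0 | BB | PRT | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 9 | NULL | 0 | Transient | 76.50 | 0 | 1 | Canceled | 2015-07-01
2 | 1 | 65 | 2015 | July | 27 | 1 | 0 | 4 | 1 | 0.0 | 0 | BB | PRT | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 9 | NULL | 0 | Transient | 68.00 | 0 | 1 | Canceled | 2015-04-30
3 | 1 | 92 | 2015 | July | 27 | 1 | 2 | 4 | 2 | 0.0 | 0 | BB | PRT | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 9 | NULL | 0 | Transient | 76.50 | 0 | 2 | Canceled | 2015-06-23
4 | 1 | 100 | 2015 | July | 27 | 2 | 0 | 2 | 2 | 0.0 | 0 | BB | PRT | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 9 | NULL | 0 | Transient | 76.50 | 0 | 1 | Canceled | 2015-04-02
5 | 1 | 79 | 2015 | July | 27 | 2 | 0 | 3 | 2 | 0.0 | 0 | BB | PRT | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 9 | NULL | 0 | Transient | 76.50 | 0 | 1 | Canceled | 2015-06-25
6 | 0 | 3 | 2015 | July | 27 | 2 | 0 | 3 | 1 | 0.0 | 0 | HB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 1 | No Deposit | 1 | NULL | 0 | Transient-Party | 58.67 | 0 | 0 | Check-Out | 2015-07-05
7 | 1 | 63 | 2015 | July | 27 | 2 | 1 | 3 | 1 | 0.0 | 0 | BB | PRT | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 9 | NULL | 0 | Transient | 68.00 | 0 | 0 | Canceled | 2015-06-25
8 | 1 | 62 | 2015 | July | 27 | 2 | 2 | 3 | 2 | 0.0 | 0 | BB | PRT | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 8 | NULL | 0 | Transient | 76.50 | 0 | 1 | No-Show | 2015-07-02
9 | 1 | 62 | 2015 | July | 27 | 2 | 2 | 3 | 2 | 0.0 | 0 | BB | PRT | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 8 | NULL | 0 | Transient | 76.50 | 0 | 1 | No-Show | 2015-07-02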

Last rows

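Index | IsCanceled | LeadTime | ArrivalDateYear | ArrivalDateMonth | ArrivalDateWeekNumber | ArrivalDateDayOfMonth | StaysInWeekendNights | StaysInWeekNights | Adults | Children | Babies | Meal | Country | MarketSegment | DistributionChannel | IsRepeatedGuest | PreviousCancellations | PreviousBookingsNotCanceled | ReservedRoomType | AssignedRoomType | BookingChanges | DepositType | Agent | Company | DaysInWaitingList | CustomerType | ADR | RequiredCarParkingSpaces | TotalOfSpecialRequests | ReservationStatus | ReservationStatusDate
79320 | 0 | 44 | 2017 | August | 35 | 31 | 1 | 3 | 2 | 0.0 | 0 | SC | DEU | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 9 | NULL | 0 | Transient | 140.75 | 0 | 1 | Check-Out | 2017-09-04
79321 | 0 | 188 | 2017 | August | 35 | 31 | 2 | 3 | 2 | 0.0 | 0 | BB | DEU | Direct | Direct | 0 | 0 | 0 | A | A | 0 | No Deposit | 14 | NULL | 0 | Transient | 99.00 | 0 | 0 | Check-Out | 2017-09-05
79322 | 0 | 135 | 2017 | August | 35 | 30 | 2 | 4 | 3 | 0.0 | 0 | BB | JPN | Online TA | TA/TO | 0 | 0 | 0 | G | G | 0 | No Deposit | 7 | NULL | 0 | Transient | 209.00 | 0 | 0 | Check-Out | 2017-09-05
79323 | 0 | 164 | 2017 | August | 35 | 31 | 2 | 4 | 2 | 0.0 | 0 | BB | DEU | Offline TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 42 | NULL | 0 | Transient | 87.60 | 0 | 0 | Check-Out | 2017-09-06
79324 | 0 | 21 | 2017 | August | 35 | 30 | 2 | 5 | 2 | 0.0 | 0 | BB | BEL | Offline TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 394 | NULL | 0 | Transient | 96.14 | 0 | 2 | Check-Out | 2017-09-06
79325 | 0 | 23 | 2017 | August | 35 | 30 | 2 | 5 | 2 | 0.0 | 0 | BB | BEL | Offline TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 394 | NULL | 0 | Transient | 96.14 | 0 | 0 | Check-Out | 2017-09-06
79326 | 0 | 102 | 2017 | August | 35 | 31 | 2 | 5 | 3 | 0.0 | 0 | BB | FRA | Online TA | TA/TO | 0 | 0 | 0 | E | E | 0 | No Deposit | 9 | NULL | 0 | Transient | 225.43 | 0 | 2 | Check-Out | 2017-09-07
79327 | 0 | 34 | 2017 | August | 35 | 31 | 2 | 5 | 2 | 0.0 | 0 | BB | DEU | Online TA | TA/TO | 0 | 0 | 0 | D | D | 0 | No Deposit | 9 | NULL | 0 | Transient | 157.71 | 0 | 4 | Check-Out | 2017-09-07
79328 | 0 | 109 | 2017 | August | 35 | 31 | 2 | 5 | 2 | 0.0 | 0 | BB | GBR | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 89 | NULL | 0 | Transient | 104.40 | 0 | 0 | Check-Out | 2017-09-07
79329 | 0 | 205 | 2017 | August | 35 | 29 | 2 | 7 | 2 | 0.0 | 0 | HB | DEU | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 9 | NULL | 0 | Transient | 151.20 | 0 | 2 | Check-Out | 2017-09-07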

Duplicate rows

Most frequent

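Index | IsCanceled | LeadTime | ArrivalDateYear | ArrivalDateMonth | ArrivalDateWeekNumber | ArrivalDateDayOfMonth | StaysInWeekendNights | StaysInWeekNights | Adults | Children | Babies | Meal | Country | MarketSegment | DistributionChannel | IsRepeatedGuest | PreviousCancellations | PreviousBookingsNotCanceled | ReservedRoomType | AssignedRoomType | BookingChanges | DepositType | Agent | Company | DaysInWaitingList | CustomerType | ADR | RequiredCarParkingSpaces | TotalOfSpecialRequests | ReservationStatus | ReservationStatusDate | count
5404 | 1 | 277 | 2016 | November | 46 | 7 | 1 | 2 | 2 | 0.0 | 0 | BB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | Non Refund | NULL | NULL | 0 | Transient | 100.0 | 0 | 0 | Canceled | 2016-04-04 | 180
4181 | 1 | 68 | 2016 | February | 8 | 17 | 0 | 2 | 2 | 0.0 | 0 | BB | PRT | Groups | TA/TO | 0 | 1 | 0 | A | A | 0 | Non Refund | 37 | NULL | 0 | Transient | 75.0 | 0 | 0 | Canceled | 2016-01-06 | 150
5075 | 1 | 188 | 2016 | June | 25 | 15 | 0 | 2 | 1 | 0.0 | 0 | BB | PRT | Offline TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | Non Refund | 119 | NULL | 39 | Transient | 130.0 | 0 | 0 | Canceled | 2016-01-18 | 109
4879 | 1 | 158 | 2016 | May | 22 | 24 | 0 | 2 | 1 | 0.0 | 0 | BB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | Non Refund | 37 | NULL | 31 | Transient | 130.0 | 0 | 0 | Canceled | 2016-01-18 | 101
3850 | 1 | 34 | 2015 | December | 50 | 8 | 0 | 2 | 1 | 0.0 | 0 | BB | PRT | Offline TA/TO | TA/TO | 0 | 1 | 0 | A | A | 0 | Non Refund | 19 | NULL | 0 | Transient | 90.0 | 0 | 0 | Canceled | 2015-11-17 | 100
3792 | 1 | 28 | 2017 | March | 9 | 2 | 0 | 3 | 2 | 0.0 | 0 | BB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | Non Refund | NULL | NULL | 0 | Transient | 95.0 | 0 | 0 | Canceled | 2017-02-02 | 99
3906 | 1 | 38 | 2017 | January | 2 | 14 | 0 | 1 | 1 | 0.0 | 0 | BB | PRT | Corporate | Corporate | 0 | 0 | 0 | A | A | 0 | Non Refund | NULL | 67 | 0 | Transient | 75.0 | 0 | 0 | Canceled | 2016-12-07 | 99
4872 | 1 | 156 | 2017 | April | 17 | 26 | 0 | 3 | 2 | 0.0 | 0 | BB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | Non Refund | 37 | NULL | 0 | Transient | 100.0 | 0 | 0 | Canceled | 2016-11-21 | 99
4205 | 1 | 71 | 2016 | June | 25 | 14 | 0 | 3 | 1 | 0.0 | 0 | BB | PRT | Offline TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | Non Refund | 236 | NULL | 0 | Transient | 120.0 | 0 | 0 | Canceled | 2016-04-27 | 89
4939 | 1 | 166 | 2016 | November | 45 | 1 | 0 | 3 | 1 | 0.0 | 0 | BB | PRT | Offline TA/TO | TA/TO | 0 | 0 | 0 | A | A | 0 | Non Refund | 236 | NULL | 0 | Transient | 110.0 | 0 | 0 | Canceled | 2016-07-13 | 85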
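
A duplicate summary of this kind can be sketched in plain Python by counting identical rows. The sketch assumes one common definition of "duplicate rows" (every repeat after the first occurrence, which is what pandas' `DataFrame.duplicated` marks with its default `keep="first"`); whether the report uses exactly this definition is an assumption. The function name and the shortened rows are hypothetical.

```python
from collections import Counter

def duplicate_summary(rows):
    """Return (number of repeated rows, [(row, count), ...] for rows seen more than once)."""
    counts = Counter(tuple(row) for row in rows)
    # Each group of c identical rows contributes c - 1 repeats.
    duplicate_rows = sum(c - 1 for c in counts.values() if c > 1)
    # most_common() sorts by count, giving a "most frequent" listing.
    most_frequent = [(row, c) for row, c in counts.most_common() if c > 1]
    return duplicate_rows, most_frequent

# Shortened illustrative rows: (LeadTime, Agent, ADR, ReservationStatus).
rows = [
    (62, 8, 76.5, "No-Show"),
    (62, 8, 76.5, "No-Show"),
    (63, 9, 68.0, "Canceled"),
    (79, 9, 76.5, "Canceled"),
    (62, 8, 76.5, "No-Show"),
]
n_duplicates, groups = duplicate_summary(rows)
share = n_duplicates / len(rows)
print(n_duplicates, f"{share:.1%}", groups)
```

Rows that are identical across every column are not necessarily data errors in a bookings dataset (group reservations can legitimately repeat), so the count is a prompt for inspection rather than a list of rows to drop.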